BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to processing of digitized speech and more 
particularly to compression of voice data to reduce bandwidth required to 
transmit the speech over digital transmission media while preserving 
perceptual speech quality. 

2. Background of the Art 

With the current growth of digital transmission and the convergence of 
voice and data networks world-wide, digitized speech signals place increasing 
bandwidth burdens on digital networks. Existing fixed and variable rate 
speech compression techniques suffer from poor speech quality in the 
reconstructed speech and lack the flexibility to adapt dynamically to changing 
network bandwidth constraints. 

Contemporary digital transmission environments beneficially 
accommodating variable data rates include multi-channel long-haul telecom, 
and voice over Internet Protocol (IP) applications. 

The current trend in IP networks toward a quality-of-service (QoS) based 
rate structure is supported to only limited extents by existing voice 
compression systems, which generally offer a limited range of data rates and 
output speech quality. 

SUMMARY 

The invention relates to a device that includes an encoder. The encoder 
compresses a plurality of signals at variable rates based on a plurality of 



Docket No. 05313P002 



Express Mail No. EM14067308US 



prioritized parameters to reduce signal bandwidth while preserving perceptual 
signal quality. 

Also the invention relates to a device that includes a decoder. The 
decoder decompresses a plurality of compressed signals at variable rates based 
on a plurality of prioritized parameters to reduce signal bandwidth while 
preserving perceptual signal quality. 

BRIEF DESCRIPTION OF THE DRAWINGS 
The invention is illustrated by way of example and not by way of 
limitation in the figures of the accompanying drawings in which like references 
indicate similar elements. It should be noted that references to "an" or "one" 
embodiment in this disclosure are not necessarily to the same embodiment, and 
such references mean at least one. 

Figure 1 illustrates a block diagram of an embodiment of the invention 
having a Variable Rate Speech Encoder. 

Figure 2 illustrates a block diagram of one embodiment of a Variable 
Rate Speech Decoder. 

Figure 3 illustrates a signal flow diagram of an Epoch Locator portion of 
the Encoder illustrated in Figure 1. 

Figure 4 illustrates a signal flow diagram of Primary Epoch Analysis 
operations in the Encoder illustrated in Figure 1. 

Figure 5 illustrates a signal flow diagram of a Secondary Epoch Analysis 
portion of the Encoder illustrated in Figure 1. 

Figure 6 illustrates a signal flow diagram of an Excitation Generator 
portion of the Decoder illustrated in Figure 2. 

Figure 7 illustrates a signal flow diagram of Synthesizing Filter 
segments of the Decoder illustrated in Figure 2. 

Figure 8 illustrates a signal flow diagram of an embodiment having 
Output Scaling and Filtering portions of the Decoder illustrated in Figure 2. 

Figure 9 illustrates a structure of a bit stream sent over a digital channel. 

DETAILED DESCRIPTION OF AN EMBODIMENT 
The invention generally relates to the efficient transmission of digitized 
speech while preserving perceptual speech quality. This is accomplished by 
using an Encoder at the transmitting end and a Decoder at the receiving end of a 
digital transmission medium. Referring to the figures, exemplary 
embodiments of the invention will now be described. The exemplary 
embodiments are provided to illustrate the invention and should not be 
construed as limiting the scope of the invention. 

Figure 1 illustrates a block diagram of an Encoder in one embodiment of 
the invention. The Encoder comprises Epoch Locator unit 10 to identify 
segments of an input signal for further analysis, Primary 30 and Secondary 50 
Analysis units to extract parameters that describe signal segments and 
associate a priority value with each parameter, and Frame Assembly unit 60 to 
prepare the parameters for transmission. 

While the following discussion relates to the variable rate transmission 
and reception of compressed speech signals over a digital transmission 
medium, one should note that other types of signals can also benefit from the 
embodiments of the invention, such as audio associated with video 
streaming signals. In a transmitting telephone, an input channel of speech 
generally originates as an analog signal. In one embodiment, this signal is 
converted to a digital format (by an Analog to Digital converter) and presented 
to the Encoder. The conversion from analog to digital formats may take place 
in the immediate physical vicinity of the Encoder, or digital signals may be 
forwarded (e.g. over the Public Switched Telephone Network (PSTN)) from 
remote locations to the Encoder. When encoding (compressing) a given 
channel of digitized speech, frames of output (channel) data appear at the 
output of the Encoder at a variable rate that is determined by activity in the 
input audio signal. In one embodiment, each frame of data sent to the digital 
transmission medium consists of an encoding of (typically) 15 parameters 
describing an epoch (segment) of the input audio signal. 

The Encoder compresses speech at a variable rate, which allocates available 
bandwidth to those portions of the digital signal that are most significant 
perceptually. The parameters that describe an epoch are ordered from most 
important to least important in their influence on perceived speech quality, and 
a Priority Value is associated with each parameter detailing its importance in 
the current audio context for reconstructed speech audio quality. The Priority 
Values are not sent to the receiving end, but are used in one of two ways: 

(1) Other systems, external to the present invention, which manage 
the traffic over the digital medium may use the Priority Values to drop 
parameters from the transmitted bit stream, thus further reducing 
bandwidth with minimal impact on speech quality. 

(2) Other systems, external to the present invention, which manage 
the traffic over the digital medium may signal the present invention to 
use the Priority Values to drop parameters from its output bit stream, 
thus further reducing bandwidth with minimal impact on speech 
quality. 
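
The priority-based dropping described in items (1) and (2) can be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the Param structure, the function name frame_bits_after_drop, and the convention that a larger Priority Value marks a less important parameter are all introduced here for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one encoded parameter in a frame. */
typedef struct {
    int bits;      /* number of bits the parameter occupies */
    int priority;  /* assumed: 0 = most important, larger = less important */
} Param;

/* Return the frame size in bits when every parameter whose priority
 * exceeds max_priority is dropped from the bit stream. */
int frame_bits_after_drop(const Param *params, size_t count, int max_priority)
{
    size_t i;
    int total = 0;

    assert(params != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (params[i].priority <= max_priority) {
            total += params[i].bits;
        }
    }
    return total;
}
```

Under these assumptions a traffic manager (first method) or the Encoder itself (second method) would lower max_priority as the available bandwidth shrinks.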

In situations in which the Encoder and traffic management systems are 
physically co-located or share a high bandwidth interface, it may be 
advantageous to employ the first method. Such systems include the Network 
Manager scenario described in copending patent application entitled 
TELECOMMUNICATION DATA COMPRESSION, S/N ____, filed on ____. 
In situations in which the Encoder and traffic management 
systems are not physically co-located or share only a low bandwidth interface, 
it may be advantageous to employ the second approach. Such systems include 
cellular telephone networks in which the Encoder would advantageously 
reside in the end user's cellular telephone while network traffic management 
functions would be performed centrally or at the cell level in the network. 

Figure 3 illustrates signal flow in Epoch Locator 10. In one embodiment 
Epoch Locator 10 identifies segments (epochs) in input speech that correspond 
to individual periods of a speaker's pitch. During intervals of voiced speech 
(when the speaker's vocal cords are vibrating and sending pulses of air at a 
regular rate into the upper vocal tract, either real-time or synthesized) Epoch 
Locator 10 identifies the points at which these pulses occur. During intervals of 
unvoiced speech (when the vocal cords are not active or synthesized speech is 
not active) Epoch Locator 10 identifies random segments for analysis. The 
identification of the putative pulse locations involves detecting sudden 
increases in relative signal energy. The Epoch Locator signal flow described 
here is a modification of the pitch tracking described in U.S. Patents 4,980,917 
and 5,208,897. 






Illustrated in Figure 3, Full Wave Rectifier 11 operates on the Input 
Audio Signal time series, $\{S_n\}$, by taking the mathematical absolute value to 
produce the time series $\{|S_n|\}$ in one embodiment. The time series or signal 
$\{S_n\}$ is assumed to represent a standard PSTN speech signal sampled at 8,000 
samples per second and converted from the PSTN standard of Mu-law or A-law 
encoding to a linear 12 bit format. In one embodiment, Cube and Smooth 
Operations 12 operate on $\{|S_n|\}$ to produce the time series $\{Y_n\}$ according to the 
following equation: 

$$Y_n = \left(15\,Y_{n-1} + \mathrm{Minimum}(2047, |S_n|)^3 / 2048\right) / 16$$ (Eq. 1) 

In one embodiment, Log2 operation 13 operates on $\{Y_n\}$ to produce $\{y_n\}$ 
according to the following equation: 

$$y_n = 32\,\mathrm{Log2}(Y_n)$$ (Eq. 2) 

In one embodiment, Difference Over 11 Samples Operation 14 operates 
on $\{y_n\}$ to produce $\{D_n\}$ according to the following equation: 

$$D_n = y_n - (y_{n-1} + y_{n-2} + y_{n-3} + y_{n-4} + y_{n-5} + y_{n-6} + y_{n-7} + y_{n-8} + y_{n-9} + y_{n-10})/10$$ (Eq. 3) 

In one embodiment, Clamp and Smooth When Falling operation 15 
operates on $\{D_n\}$ to produce $\{x_n\}$ according to the following equations: 

$$D'_n = \mathrm{Maximum}(\mathrm{Minimum}(64, D_n), -128)$$ 

$$x'_n = \begin{cases} (4D'_n + 7x'_{n-1})/8 & \text{if } 4D'_n < x'_{n-1} \text{ and } x'_{n-1} > 32 \\ (4D'_n + 15x'_{n-1})/16 & \text{if } 4D'_n < x'_{n-1} \text{ and } x'_{n-1} \le 32 \\ 4D'_n & \text{if } 4D'_n \ge x'_{n-1} \end{cases}$$ 

$$x_n = x'_n / 4$$ (Eq. 4) 



In one embodiment, Local Maximum Follower 16 operates on $\{x_n\}$ to 
produce $\{M_n\}$ according to the following equations: 

If $x_n > M'_{n-1}$: 
$$M'_n = x_n, \qquad M''_n = 16\,M'_n + 8$$ 

If $x_n \le M'_{n-1}$: 
$$M'_n = M''_{n-1} / 16, \qquad M''_n = M''_{n-1} - M'_n$$ 

$$M_n = \begin{cases} M'_n & \text{if } M'_n \ge 1 \\ 1 & \text{if } M'_n < 1 \end{cases}$$ (Eq. 5) 

In one embodiment, Difference Over 5 Samples operation 17 operates on 
$\{M_n\}$ to produce $\{t_n\}$ according to the following equation: 

$$t_n = M_n - \left[(M_{n-1} + M_{n-2} + M_{n-3} + M_{n-4})/4\right] - 3$$ (Eq. 6) 
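
The Local Maximum Follower and the Difference Over 5 Samples operation can be sketched in C as below. This is a hedged illustration only: integer arithmetic, zero initial state, the Follower structure and both function names are assumptions made for the example, and the trigger computation assumes that Eq. 6 subtracts the average of the four previous follower outputs (and a constant 3) from the current output.

```c
#include <assert.h>

/* Follower state: the previous M' and M'' values.
 * Starting both at zero is an assumption; the text gives no initial values. */
typedef struct {
    int m1;  /* M'  */
    int m2;  /* M'' */
} Follower;

/* One step of the Local Maximum Follower (Eq. 5); returns M_n. */
int follower_step(Follower *f, int x)
{
    assert(f != 0);
    if (x > f->m1) {
        /* Rising input: latch the new maximum. */
        f->m1 = x;
        f->m2 = 16 * f->m1 + 8;
    } else {
        /* Falling input: let the held maximum decay by about 1/16. */
        f->m1 = f->m2 / 16;
        f->m2 = f->m2 - f->m1;
    }
    return f->m1 >= 1 ? f->m1 : 1;
}

/* Difference Over 5 Samples (Eq. 6). prev[0..3] hold the four
 * follower outputs that precede m_n. */
int trigger_value(int m_n, const int prev[4])
{
    assert(prev != 0);
    return m_n - (prev[0] + prev[1] + prev[2] + prev[3]) / 4 - 3;
}
```

A sudden rise in the follower output relative to its recent average yields a large positive trigger value, which is the behavior the Epoch Triggering Logic relies on.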



The signal $\{t_n\}$ generally shows sharp positive-going peaks at the pulse 
locations. The signal $\{t_n\}$ is stored in Trigger Buffer 18 for later use as the 
primary driver of Epoch Triggering Logic 25. 






The raw indications of possible pulse locations reflected in Trigger 
Buffer 18 are subject to errors as a result of noise in the input signal. To counter 
the effect of the noise on pulse location accuracy, in one embodiment an 
Average Magnitude Difference Function (AMDF) is computed once every 64 
samples. The nulls in this function occur at points that correspond to strong 
periodicities in the input signal. In one embodiment, prior to computing the 
AMDF the input audio signal $\{S_n\}$ is subjected to Low Pass Filter 19 to produce 
a signal $\{z_n\}$ according to the following equations: 

$$z'_n = 0.5928955\,S_n + 0.0849914\,z'_{n-1} + 0.5928955\,S_{n-1}$$ 
$$z''_n = 0.8\,z'_n$$ 
$$z'''_n = 0.5928955\,z''_n + 0.0849914\,z'''_{n-1} + 0.5928955\,z''_{n-1}$$ 
$$z_n = 0.8\,z'''_n$$ (Eq. 7) 

The AMDF function values to be used while processing triggers for 
samples N to N+63 are computed from $\{z_n\}$ as 49 values $\{a'_k : k = 0,1,2,\dots,48\}$ as 
follows: 

$$a'_k = \sum_{j}^{49} \left| z_{N+j-\mathrm{halflag}(k)+2} - z_{N+j+\mathrm{halflag}(k)+2} \right|$$ (Eq. 8) 

where in one embodiment, halflag() is given by Table 1. 

The values of halflag() are roughly uniformly spaced on a logarithmic 
scale. The actual lag values used in the AMDF are 2*halflag() and span the 
range from 16 to 192. The range of 16 samples to 192 samples corresponds to 
possible pitch frequencies of 500 Hz down to 41.7 Hz at the 8,000 Hz sampling 
rate. 



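In one embodiment, the Raw AMDF $\{a'_k\}$ is then normalized to produce 
$\{a_k\}$ as follows: 

$$\mathrm{MaxMag} = \mathrm{Maximum}(\{a'_k\})$$ 
$$\mathrm{MinMag} = \mathrm{Minimum}(\{a'_k\})$$ 
$$\mathrm{Range} = \mathrm{MaxMag} - \mathrm{MinMag}$$ 

$$a_k = 10\,(a'_k - \mathrm{MinMag}) / \mathrm{Range} \quad \text{for } k = 0,1,\dots,48$$ (Eq. 9) 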
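
The normalization of Eq. 9 can be sketched as follows. This is a minimal illustration: the function name normalize_amdf, the use of double precision, and the handling of a completely flat input (mapped to all zeros) are assumptions, since the text does not address those points.

```c
#include <assert.h>

#define AMDF_POINTS 49

/* Map the raw AMDF values a_raw[0..48] onto the range 0..10 (Eq. 9).
 * A flat input (Range == 0) yields all zeros; that guard is an assumption. */
void normalize_amdf(const double a_raw[AMDF_POINTS], double a[AMDF_POINTS])
{
    double max_mag, min_mag, range;
    int k;

    assert(a_raw != 0 && a != 0);
    max_mag = a_raw[0];
    min_mag = a_raw[0];
    for (k = 1; k < AMDF_POINTS; k++) {
        if (a_raw[k] > max_mag) max_mag = a_raw[k];
        if (a_raw[k] < min_mag) min_mag = a_raw[k];
    }
    range = max_mag - min_mag;
    for (k = 0; k < AMDF_POINTS; k++) {
        a[k] = (range > 0.0) ? 10.0 * (a_raw[k] - min_mag) / range : 0.0;
    }
}
```

After this step the deepest null of the raw function sits at exactly 0 and the highest peak at exactly 10.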



The Normalized AMDF $\{a_k\}$ has values ranging from 0 to 10 with the 
zeroes or nulls at points corresponding to the lags (frequencies) exhibiting the 
most pronounced periodicities in the low pass filtered version of the input 
signal. The null point with the lowest index (highest frequency) is then 
widened by setting the two neighboring points on either side to zero. By 
definition the first null begins at index p and extends to index q, that is, 

$$a_k = 0 \quad \text{for } k \text{ in } p \text{ to } q$$ 

and 

$$a_k > 0 \quad \text{for } k < p$$ 

The null is widened by the following operation: 

$$a_{p-1} = 0 \quad \text{if } p > 0$$ 
$$a_{p-2} = 0 \quad \text{if } p > 1$$ 
$$a_{q+1} = 0 \quad \text{if } q < 47$$ 
$$a_{q+2} = 0 \quad \text{if } q < 46$$ (Eq. 10) 
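
The null-widening step can be sketched in C as below. This is an illustration under assumptions: the function name widen_first_null, the return value, and the use of an exact comparison with zero to detect a null are choices made for the example. The index limits 47 and 46 are taken as printed.

```c
#include <assert.h>

#define AMDF_POINTS 49

/* Widen the first (lowest-index) null of the normalized AMDF a[0..48]
 * by zeroing up to two points on each side of it.
 * Returns the index p where the null begins, or -1 if no entry is zero. */
int widen_first_null(double a[AMDF_POINTS])
{
    int p = 0;
    int q;

    assert(a != 0);
    while (p < AMDF_POINTS && a[p] > 0.0) p++;      /* find start of null */
    if (p == AMDF_POINTS) return -1;

    q = p;
    while (q + 1 < AMDF_POINTS && a[q + 1] == 0.0) q++;  /* find end of null */

    if (p > 0)  a[p - 1] = 0.0;
    if (p > 1)  a[p - 2] = 0.0;
    if (q < 47) a[q + 1] = 0.0;
    if (q < 46) a[q + 2] = 0.0;
    return p;
}
```

Widening the null makes the later trigger adjustment tolerant of small errors in the estimated pitch lag.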



In one embodiment Extrapolate to Linear Time Scale operation 21 is then 
performed to construct an AMDF approximation $\{A_k; k = 0,1,\dots,220\}$ on all 
possible lag values from 0 to 200 with the following operation (expressed in C 
programming code): 

k=0; 
for(j=0;j<221;j++) 
{ 
    if( (j>2*halflag(k)) && (k<48) ) k++; 
    A[j] = a[k]; 
}                                                   (Eq. 11) 

The AMDF approximation $\{A_k; k = 0,1,\dots,220\}$ is then written to AMDF 
Buffer 22 for use in Epoch Triggering Logic 25. 

In one embodiment Epoch Triggering Logic 25 also employs an RMS (root 
mean square) estimate $\{erms_n\}$ computed from a High Pass Filtered version of 
the Input Signal $\{S_n\}$. High Pass Filter 23 computes a signal $\{p_n\}$ from $\{S_n\}$ as 
follows: 

$$p_n = 0.8333\,(S_n - S_{n-1} + 0.4\,S_{n-2})$$ (Eq. 12) 

In one embodiment Estimate RMS function 24 computes $\{erms_n\}$ from 
$\{p_n\}$ according to the following equation: 

$$erms_n = (127\,erms_{n-1} + p_n)/128$$ (Eq. 13) 



Epoch Triggering Logic 25 examines the trigger buffer and the AMDF 
approximation in the AMDF buffer to determine if the start of a new Epoch 
should be declared at a point, n, in time where n falls in the range N to N+63 to 
be used with the current contents of the AMDF buffer computed as in Eq. 8 
above. In the Epoch Triggering Logic a variable, PeriodSize, is defined as the 
time in samples since the most recent trigger (epoch start). In one embodiment 
two trigger signals are considered. The first is simply the trigger signal 
recorded in Trigger Buffer 18; the second is the value from Trigger Buffer 18 
plus 2 and minus the corresponding value from AMDF Buffer 22. The 
operation of adjusting by the AMDF value serves to pull down spurious 
triggers which do not correspond to strong periodicities in the input signal. 
The Epoch Triggering Logic computes these two trigger signals for the current 
point n and for 19 points (n+j; j=1 to 19) in the future. If a trigger point appears 
in the near future that is stronger than the current point, triggering at the 
current point is suppressed to wait for the stronger trigger. To this end the 
following computations are performed to construct arrays of the trigger values 
$\{tr_k; k = 0 \text{ to } 19\}$ and adjusted trigger values $\{ta_k; k = 0 \text{ to } 19\}$ for the current point 
and 19 points in the future, recalling from Eq. 6 that Trigger Buffer 18 contains 
the signal $\{t_n\}$ and from Eq. 11 that the AMDF Buffer contains the signal $\{A_k; 
k = 0 \text{ to } 220\}$: 



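$$tr_k = t_{n+k} \quad \text{for } k = 0 \text{ to } 19$$ 

$$ta_k = t_{n+k} + 2 - A_{\mathrm{PeriodSize}+k} \quad \text{for } k = 0 \text{ to } 19$$ 

$$\mathrm{Maxtr} = \mathrm{Maximum}(tr_k)$$ 

$$\mathrm{Maxta} = \mathrm{Maximum}(ta_k)$$ (Eq. 14) 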



In one embodiment triggering (declaring the start of a new epoch) 
occurs when the following conditions are met: 

PeriodSize = 200 OR 
( (Maxtr <= tr_0 + 5 OR Maxta <= ta_0) AND (tr_0 > 4 OR ta_0 >= 0) AND 
PeriodSize >= 16 ) 

When triggering occurs an addition is made to the next available space 
in Raw Epoch Log 26 to record the location, n, at which the trigger occurred, the 
time, PeriodSize, since the previous trigger (the Epoch Length), and the value 
of erms_n as computed in Eq. 13. 
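
The triggering decision can be sketched in C as below. This is a hedged illustration: the function name should_trigger, the integer types, and the assumption that the caller has already filled the two 20-entry arrays of raw and AMDF-adjusted trigger values for the current point and the 19 following points are all introduced for the example.

```c
#include <assert.h>

#define LOOKAHEAD 20  /* the current point plus 19 future points */

/* Largest value in v[0..count-1]. */
static int max_of(const int *v, int count)
{
    int k;
    int m = v[0];
    for (k = 1; k < count; k++) {
        if (v[k] > m) m = v[k];
    }
    return m;
}

/* Return 1 when a new epoch should be declared at the current point.
 * tr[k] and ta[k] are the raw and adjusted trigger values at point n+k;
 * period_size is the number of samples since the most recent trigger. */
int should_trigger(int period_size, const int tr[LOOKAHEAD], const int ta[LOOKAHEAD])
{
    int maxtr, maxta;

    assert(tr != 0 && ta != 0);
    if (period_size == 200) return 1;   /* forced trigger after 200 samples */

    maxtr = max_of(tr, LOOKAHEAD);
    maxta = max_of(ta, LOOKAHEAD);
    return (maxtr <= tr[0] + 5 || maxta <= ta[0])   /* no clearly stronger trigger ahead */
        && (tr[0] > 4 || ta[0] >= 0)                /* current point is strong enough */
        && period_size >= 16;                       /* minimum epoch length */
}
```

The first clause is what suppresses triggering at the current point when a stronger candidate appears within the next 19 samples.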

In one embodiment, whenever the current value of PeriodSize plus the 
sum of the Epoch Lengths in the Raw Epoch Log exceeds 344 samples, Epoch 
Smoothing and Combining operation 27 is activated. Epoch Smoothing and 
Combining 27 creates Epoch Log 28 from Raw Epoch Log 26 by examining and 
modifying the first few entries in Raw Epoch Log 26 and then dispatching the 
first Epoch in Epoch Log 28 to Primary Epoch Analysis unit 30. 

By definition Raw Epoch Log 26 is a structure with N entries and three 
fields: Location, Length, and EstRms, that is: 

RawEpochLog.Location_k   for k = 0,1,...,N-1 
RawEpochLog.Length_k     for k = 0,1,...,N-1 
RawEpochLog.EstRms_k     for k = 0,1,...,N-1 

Epoch Log 28 is a similar structure that is initially set equal to Raw 
Epoch Log 26, that is: 

EpochLog.Location_k = RawEpochLog.Location_k   for k = 0,1,...,N-1 
EpochLog.Length_k   = RawEpochLog.Length_k     for k = 0,1,...,N-1 
EpochLog.EstRms_k   = RawEpochLog.EstRms_k     for k = 0,1,...,N-1   (Eq. 15) 

In one embodiment Epoch Smoothing and Combining 27 comprises 6 
operations, the first two of which are designed to enhance speech quality by 
smoothing (correcting presumed errors) in successive Epoch Lengths, the next 
3 of which are designed to combine epochs in the interest of reducing channel 
bit rate by reducing frame rate, and the last one of which enhances quality by 
extending the epoch length pattern indicative of voiced speech for a short 
distance into the following unvoiced speech area. Each operation operates on 
and potentially modifies Epoch Log 28 as constructed in Eq. 15 above. 

In one embodiment, in one operation of Epoch Smoothing, missed 
triggers are hypothesized and inserted into the log. The conditions for 
executing this operation are: 

EpochLog.Length_1 < 200 AND 
NearTo(EpochLog.Length_0, EpochLog.Length_1 / 2, 1.3) AND 
NearTo(EpochLog.Length_2, EpochLog.Length_1 / 2, 1.3) 

where the function NearTo(a,b,z) is defined as follows: 

NearTo(a,b,z) = { True    if Max(a,b)/Min(a,b) <= z 
                { False   otherwise 

When these conditions are met the following modifications are 
performed to split the second log entry into two entries: 

Shift log entries with indices >= 2 one slot higher 
EpochLog.Length_2 = EpochLog.Length_1 / 2 
EpochLog.Length_1 = EpochLog.Length_1 - EpochLog.Length_2 
EpochLog.EstRms_2 = EpochLog.EstRms_1 
EpochLog.Location_2 = EpochLog.Location_1 
EpochLog.Location_1 = EpochLog.Location_0 + EpochLog.Length_1 
N = N + 1 
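
The missed-trigger operation can be sketched in C as below. This is an illustration under assumptions, not the disclosed implementation: the Epoch structure, the function names, the caller-supplied capacity, the guard against a zero length in near_to, and the use of real-valued division inside the NearTo tests (with integer division for the stored lengths) are all choices made for the example.

```c
#include <assert.h>

/* Hypothetical layout of one Epoch Log entry. */
typedef struct {
    int location;
    int length;
    int est_rms;
} Epoch;

/* NearTo(a,b,z): true when the larger value is within a factor z of the smaller. */
static int near_to(double a, double b, double z)
{
    double hi = (a > b) ? a : b;
    double lo = (a > b) ? b : a;
    return lo > 0.0 && hi / lo <= z;   /* lo > 0 guard is an assumption */
}

/* If entry 1 looks like two epochs merged by a missed trigger, split it.
 * Returns 1 when a split was made, 0 otherwise. */
int insert_missed_trigger(Epoch *entries, int *n, int capacity)
{
    int k;
    double half;

    assert(entries != 0 && n != 0 && *n >= 3 && *n < capacity);
    half = entries[1].length / 2.0;
    if (!(entries[1].length < 200
          && near_to(entries[0].length, half, 1.3)
          && near_to(entries[2].length, half, 1.3))) {
        return 0;
    }
    for (k = *n; k > 2; k--) {           /* shift entries >= 2 one slot higher */
        entries[k] = entries[k - 1];
    }
    entries[2].length = entries[1].length / 2;
    entries[1].length = entries[1].length - entries[2].length;
    entries[2].est_rms = entries[1].est_rms;
    entries[2].location = entries[1].location;
    entries[1].location = entries[0].location + entries[1].length;
    *n += 1;
    return 1;
}
```

For example, lengths of 40, 81 and 41 samples become 40, 41, 40 and 41, restoring a consistent pitch period across the log.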

In another operation of Epoch Smoothing, assumed false triggers are 
removed and combined with neighboring epochs. The conditions for executing 
this operation are: 

EpochLog.Length_1 + EpochLog.Length_2 < 200 AND 
NearTo(EpochLog.Length_0, EpochLog.Length_1 + EpochLog.Length_2, 1.3) AND 
NearTo(EpochLog.Length_3, EpochLog.Length_1 + EpochLog.Length_2, 1.3) 

When these conditions are met the following modifications are 
performed to combine the epochs at indices 1 and 2 into a single epoch: 

EpochLog.Length_1 = EpochLog.Length_1 + EpochLog.Length_2 
EpochLog.Location_1 = EpochLog.Location_2 
Shift log entries with indices >= 2 one slot lower 
N = N - 1 

20 

In one operation of Epoch Combining, two short Epochs of similar length 
and any amplitude are combined into a single long epoch that is labeled by the 
system as a Double Epoch. The conditions for executing this operation are: 

EpochLog.Length_0 <= 50 AND EpochLog.Length_1 <= 50 AND 
( | EpochLog.Length_0 - EpochLog.Length_1 | <= 2 ) 

When these conditions are met the following modifications are 
performed to combine the epochs with indices 0 and 1 into one epoch that is 
flagged as a Double Epoch by the addition of 200 to its length: 

EpochLog.Length_0 = 200 + EpochLog.Length_0 + EpochLog.Length_1 
EpochLog.Location_0 = EpochLog.Location_1 
Shift log entries with indices >= 1 one slot lower 
N = N - 1 
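
The Double Epoch combining step can be sketched in C as below. This is a hedged illustration: the Epoch structure and function name are hypothetical, and the shift is interpreted as removing entry 1 and closing the gap, so that the freshly combined entry 0 is preserved. That reading is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>   /* abs */

/* Hypothetical layout of one Epoch Log entry. */
typedef struct {
    int location;
    int length;
    int est_rms;
} Epoch;

/* Combine entries 0 and 1 into one Double Epoch when both are short and
 * nearly equal in length. Returns 1 when combined, 0 otherwise. */
int combine_similar_short_epochs(Epoch *entries, int *n)
{
    int k;

    assert(entries != NULL && n != NULL && *n >= 2);
    if (!(entries[0].length <= 50 && entries[1].length <= 50
          && abs(entries[0].length - entries[1].length) <= 2)) {
        return 0;
    }
    /* Adding 200 flags the combined entry as a Double Epoch. */
    entries[0].length = 200 + entries[0].length + entries[1].length;
    entries[0].location = entries[1].location;
    for (k = 1; k + 1 < *n; k++) {       /* close the gap left by entry 1 */
        entries[k] = entries[k + 1];
    }
    *n -= 1;
    return 1;
}
```

Because single epochs never exceed 200 samples, a stored length above 200 is unambiguous and the original combined length can be recovered by subtracting 200.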

10 

In another operation of Epoch Combining, two short Epochs of dissimilar 
length and low amplitude are combined into a single long epoch that is labeled 
by the system as a Double Epoch. The conditions for executing this operation 
are: 

EpochLog.Length_0 + EpochLog.Length_1 <= 100 AND 
EpochLog.EstRms_0 <= 60 AND EpochLog.EstRms_1 <= 60 

When these conditions are met the following modifications are 
performed to combine the epochs with indices 0 and 1 into one epoch that is 
flagged as a Double Epoch by the addition of 200 to its length: 

EpochLog.Length_0 = 200 + EpochLog.Length_0 + EpochLog.Length_1 
EpochLog.Location_0 = EpochLog.Location_1 
Shift log entries with indices >= 1 one slot lower 
N = N - 1 

In another operation of Epoch Combining, two medium length Epochs of 
similar or dissimilar length, low amplitude, and presumed unvoiced speech are 
combined into a single long epoch that is not labeled as a Double Epoch. This 
operation is repeated one more time to provide more combining and hence 
more data rate reduction. The conditions for executing this operation employ 
the variable Previous_rcl which is exported from Primary Epoch Analysis unit 
30. They are: 

EpochLog.Length_0 + EpochLog.Length_1 <= 200 AND 
EpochLog.EstRms_0 <= 60 AND EpochLog.EstRms_1 <= 60 AND 
Previous_rcl < 0 

When these conditions are met the following modifications are 
performed: 

EpochLog.Length_0 = EpochLog.Length_0 + EpochLog.Length_1 
EpochLog.Location_0 = EpochLog.Location_1 
Shift log entries with indices >= 1 one slot lower 
N = N - 1 

In another operation of Epoch Smoothing and Combining, short epochs 
are duplicated and extended into a following region with Epoch Length = 200, 
which is indicative of an absence of triggers. The conditions for executing this 
operation are: 

EpochLog.Length_1 = 200 AND 
( EpochLog.Length_0 < 80 OR EpochLog.Length_0 > 200 ) 

When these conditions are met the following modifications are 
performed: 

If EpochLog.Length_0 < 50 
    Shift log entries with indices >= 1 three slots higher 
    EpochLog.Length_1 = EpochLog.Length_0 
    EpochLog.Length_2 = EpochLog.Length_0 
    EpochLog.Length_3 = EpochLog.Length_0 
    EpochLog.Length_4 = 200 - 3*EpochLog.Length_0 
    EpochLog.Location_1 = EpochLog.Location_0 + EpochLog.Length_0 
    EpochLog.Location_2 = EpochLog.Location_1 + EpochLog.Length_2 
    EpochLog.Location_3 = EpochLog.Location_2 + EpochLog.Length_3 
    EpochLog.Location_4 = EpochLog.Location_3 + EpochLog.Length_4 
    EpochLog.EstRms_1 = EpochLog.EstRms_4 
    EpochLog.EstRms_2 = EpochLog.EstRms_4 
    EpochLog.EstRms_3 = EpochLog.EstRms_4 
    N = N + 3 

If EpochLog.Length_0 < 80 
    Shift log entries with indices >= 1 two slots higher 
    EpochLog.Length_1 = EpochLog.Length_0 
    EpochLog.Length_2 = EpochLog.Length_0 
    EpochLog.Length_3 = 200 - 2*EpochLog.Length_0 
    EpochLog.Location_1 = EpochLog.Location_0 + EpochLog.Length_1 
    EpochLog.Location_2 = EpochLog.Location_1 + EpochLog.Length_2 
    EpochLog.Location_3 = EpochLog.Location_2 + EpochLog.Length_3 
    EpochLog.EstRms_1 = EpochLog.EstRms_3 
    EpochLog.EstRms_2 = EpochLog.EstRms_3 
    N = N + 2 

If EpochLog.Length_0 > 200 
    Shift log entries with indices >= 1 one slot higher 
    EpochLog.Length_1 = EpochLog.Length_0 
    EpochLog.Length_2 = 200 - (EpochLog.Length_0 - 200) 
    EpochLog.Location_1 = EpochLog.Location_0 + (EpochLog.Length_1 - 200) 
    EpochLog.Location_2 = EpochLog.Location_1 + EpochLog.Length_2 
    EpochLog.EstRms_1 = EpochLog.EstRms_2 
    N = N + 1 



In one embodiment, at the conclusion of Epoch Smoothing and 
Combining function 27 the values of EpochLog.Location_0 and 
EpochLog.Length_0 are passed to Primary Epoch Analysis unit 30. After the 
Primary and Secondary Epoch Analyses are completed, all of the entries in 
EpochLog 28 are copied to RawEpochLog 26 and the entry with index 0 is removed 
from the RawEpochLog (other entries are shifted one slot lower to fill the space 
and the length of the log is reduced by one). Processing then resumes with the 
next speech sample at the top left of the Epoch Locator illustrated in Figure 3. 

Primary Epoch Analysis unit 30 is illustrated in Figure 4. In one 
embodiment the Differential Encoding of Epoch Length 31 operates on the 
Epoch Length value for the current frame and the Epoch Length value, 
Previous_Epoch_Length, from the previous frame to produce a 3-bit 
Differential Epoch Length value and in certain circumstances an 8-bit Encoded 
Epoch Length value created from the Epoch Length as follows: 

RawEL_difference = Epoch Length - Previous_Epoch_Length 

Differential Epoch Length = { RawEL_difference + 3   if -3 <= RawEL_difference <= 3 
                            { 7                      otherwise 

#Bits in Differential Epoch Length = 3 

#Bits in Encoded Epoch Length = { 0   if Differential Epoch Length < 7 
                                { 8   otherwise 

Encoded Epoch Length = { Epoch Length         if 16 <= Epoch Length <= 200 
                       { Epoch Length - 231   if 232 <= Epoch Length <= 246 
                       { Epoch Length - 46    if 247 <= Epoch Length <= 300 
                                                                        (Eq. 16) 
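
The epoch length encoding of Eq. 16 can be sketched in C as below. This is an illustration under assumptions: the EpochLengthCode structure and function name are hypothetical, and the bounds on the difference are taken as inclusive (-3 through +3) so that the 3-bit field uses codes 0 through 6 with 7 reserved as the escape value.

```c
#include <assert.h>

/* Hypothetical container for the two values produced by Eq. 16. */
typedef struct {
    int differential;  /* 3-bit Differential Epoch Length, 0..7 */
    int encoded_bits;  /* 0, or 8 when the escape code 7 is used */
    int encoded;       /* 8-bit Encoded Epoch Length, meaningful when encoded_bits == 8 */
} EpochLengthCode;

EpochLengthCode encode_epoch_length(int epoch_length, int previous_epoch_length)
{
    EpochLengthCode code = {7, 8, 0};
    int diff = epoch_length - previous_epoch_length;

    /* Single epochs span 16..200; Double Epochs (length + 200) span 232..300. */
    assert((epoch_length >= 16 && epoch_length <= 200)
           || (epoch_length >= 232 && epoch_length <= 300));

    if (diff >= -3 && diff <= 3) {       /* small change: differential code only */
        code.differential = diff + 3;
        code.encoded_bits = 0;
        return code;
    }
    if (epoch_length <= 200) {
        code.encoded = epoch_length;           /* 16..200 sent as is */
    } else if (epoch_length <= 246) {
        code.encoded = epoch_length - 231;     /* 232..246 map to 1..15 */
    } else {
        code.encoded = epoch_length - 46;      /* 247..300 map to 201..254 */
    }
    return code;
}
```

The two Double Epoch ranges fill the code values that single epochs leave unused, which is how every legal length fits in 8 bits.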

The Differential Epoch Length, #Bits in Differential Epoch Length, and 
Priority=0 are sent to Frame Assembly unit 60 described below. The Encoded 
Epoch Length, #Bits in Encoded Epoch Length, and Priority=0 are also sent to 
Frame Assembly unit 60 described below. 

In one embodiment an operation in the Primary Epoch Analysis unit 30 
illustrated in Figure 4 is High Pass Filter 23, which is the same as that illustrated 
in Figure 3 and Eq. 12, with its output being the signal $\{p_n\}$. Select Epoch 
Samples function 32 uses the Epoch Location and Epoch Length provided by 
Epoch Smoothing and Combining function 27 to extract samples from $\{p_n\}$ for 
analysis. Since the Epoch Length provided may have 200 added to it to flag a 
double epoch, an Actual_Epoch_Length is first constructed as: 

Actual_Epoch_Length = { Epoch Length         if Epoch Length <= 200 
                      { Epoch Length - 200   otherwise 
                                                                        (Eq. 17) 

Then the raw epoch samples $\{e'_k\}$ are selected from $\{p_n\}$ to include the 
epoch defined by the input parameters plus 12 extra samples. The samples 
selected are offset by 5 samples from those defined by the input parameters to 
account for triggering typically occurring a few samples into the pulse that 
drives the epoch. $\{e'_k\}$ is selected according to the following equation: 

Compute and Remove Epoch Bias operation 33 operates on 
the Raw Epoch Samples $\{e'_k\}$ to produce the Bias Removed Epoch Samples $\{e_k\}$ 
as follows: 

$$deb = \left( \sum_{k=0}^{\text{Actual\_Epoch\_Length}+11} e'_k \right) / (\text{Actual\_Epoch\_Length}+12)$$ (Eq. 18) 

$$e_k = e'_k - deb \quad \text{for } k = 0,1,\dots,\text{Actual\_Epoch\_Length}+11$$ (Eq. 19) 
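
The bias removal of Eq. 18 and Eq. 19 amounts to subtracting the mean of the selected samples from each sample, which can be sketched as follows. This is a minimal illustration: the function name remove_epoch_bias, the double-precision sample type, and the reading of Eq. 19 as a subtraction of deb from each raw sample are assumptions made for the example.

```c
#include <assert.h>

/* Subtract the mean (bias) from the raw epoch samples.
 * raw and out each hold actual_epoch_length + 12 samples.
 * Returns the bias value deb. */
double remove_epoch_bias(const double *raw, double *out, int actual_epoch_length)
{
    int count = actual_epoch_length + 12;
    int k;
    double sum = 0.0;
    double deb;

    assert(raw != 0 && out != 0 && actual_epoch_length > 0);
    for (k = 0; k < count; k++) {
        sum += raw[k];                 /* Eq. 18 numerator */
    }
    deb = sum / count;
    for (k = 0; k < count; k++) {
        out[k] = raw[k] - deb;         /* Eq. 19 */
    }
    return deb;
}
```

Removing the bias first keeps a DC offset in the epoch from inflating the RMS and covariance values computed in the following steps.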
Compute RMS operation 34 determines the RMS (root mean square) of 
the signal $\{e_k\}$ as follows: 

$$RMS = \left[ \left( \sum_{k=0}^{\text{Actual\_Epoch\_Length}-1} e_{k+12}\, e_{k+12} \right) / \text{Actual\_Epoch\_Length} \right]^{1/2}$$ (Eq. 20) 

In one embodiment Log Encoding 35 of the RMS operates according to 
the following equation to produce the LogRMS as an integer in the range 0 to 
31: 

LogRMS = Integer( 2.667 * Log2(RMS) )                         (Eq. 21) 

LogRMS = { 31       if LogRMS > 31 
         { LogRMS   otherwise                                 (Eq. 22) 

In one embodiment Differential Encoding of the LogRMS 36 operates on 
the LogRMS value for the current frame and the LogRMS value, 
Previous_LogRMS, from the previous frame to produce a 2-bit Differential 
LogRMS value and in certain circumstances a 5-bit Absolute LogRMS value as 
follows: 

RawRMS_difference = LogRMS - Previous_LogRMS 

Differential LogRMS = { RawRMS_difference + 1   if -1 <= RawRMS_difference <= 1 
                      { 3                       otherwise 

#Bits in Differential LogRMS = 2 

#Bits in Absolute LogRMS = { 0   if Differential LogRMS < 3 
                           { 5   otherwise                    (Eq. 23) 

The Differential LogRMS, #Bits in Differential LogRMS, and Priority=0 
are sent to Frame Assembly unit 60 described below. The Absolute LogRMS, 
#Bits in Absolute LogRMS, and Priority=0 are also sent to Frame Assembly 
unit 60 described below. 

In one embodiment Compute Covariance Matrix operation 37 operates 
on the Bias Removed Epoch Samples $\{e_k\}$ to create a 12x12 covariance matrix, 
PHI, and a 12x1 vector, PSI, for the current epoch. This operation is well-known 
prior art for which a discussion may be found in Deller, John R., 
Hansen, John H. L., Proakis, John G., Discrete Time Processing of Speech Signals, 
pp. 292-296, IEEE Press, New York, New York, 1993. Since the matrix PHI is 
symmetric about the diagonal, only the lower triangular half need be 
computed. The present invention implements this technique as follows: 

$$PHI_{r,c} = \sum_{k=11}^{\text{Actual\_Epoch\_Length}+10} e_{k-c}\, e_{k-r} \quad \text{for } r = 0,1,\dots,11 \text{ and } c = 0,1,\dots,r$$ (Eq. 24) 

$$PSI_c = \sum_{k=12}^{\text{Actual\_Epoch\_Length}+11} e_{k-c-1}\, e_k \quad \text{for } c = 0,1,\dots,11$$ (Eq. 25) 



PHI and PSI are passed to Invert Matrix operation 38 which employs the 
iterative Choleski decomposition method to produce 12 Reflection Coefficients 
(RCs) according to the following procedure which is well-known prior art (see 
for example Deller, Hansen & Proakis, 1993, pp. 296-313). In this procedure the 
constant eps = .0001 is used to detect a singular or near singular matrix which 
has no inverse. In this case the technique terminates prior to completing the 
computation of all 12 RCs and sets the remaining RCs to zero. The procedure 
is given in pseudo C programming code: 

for(j=0; j<12; j++) { 
    for(k=0; k<j; k++) { 
        save = PHI[j][k] * PHI[k][k]; 
        for(i=j; i<12; i++) PHI[i][j] = PHI[i][j] - PHI[i][k] * save; 
    } 
    if( | PHI[j][j] | < eps ) break; 
    RC[j] = PSI[j]; 
    for(k=0; k<j; k++) RC[j] = RC[j] - RC[k] * PHI[j][k]; 
    PHI[j][j] = 1.0 / PHI[j][j]; 
    RC[j] = RC[j] * PHI[j][j]; 
    RC[j] = Minimum(0.986, RC[j]); 
    RC[j] = Maximum(-0.986, RC[j]); 
} 
if( j < 12 ) for(i=j; i<12; i++) RC[i] = 0.;               (Eq. 26) 
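
A compilable rendering of the procedure above is sketched below. It is an illustration, not the disclosed code: the function name invert_matrix, the return value, and the helper functions are assumptions, the loop indices follow the standard form of this decomposition, and the final zero-fill is keyed on how far the loop progressed rather than on re-reading the pivot.

```c
#include <assert.h>

#define ORDER 12
#define EPS   0.0001

static double abs_val(double v)  { return v < 0.0 ? -v : v; }
static double clamp_rc(double v) { return v > 0.986 ? 0.986 : (v < -0.986 ? -0.986 : v); }

/* Compute reflection coefficients rc[0..11] from PHI and PSI.
 * Only the lower triangle of phi is read; it is overwritten in place.
 * Returns how many coefficients were computed before a near-singular
 * pivot ended the iteration (12 when the matrix is well conditioned). */
int invert_matrix(double phi[ORDER][ORDER], const double psi[ORDER], double rc[ORDER])
{
    int i, j, k;

    assert(phi != 0 && psi != 0 && rc != 0);
    for (j = 0; j < ORDER; j++) {
        for (k = 0; k < j; k++) {
            /* phi[k][k] already holds the reciprocal of the k-th pivot. */
            double save = phi[j][k] * phi[k][k];
            for (i = j; i < ORDER; i++) {
                phi[i][j] -= phi[i][k] * save;
            }
        }
        if (abs_val(phi[j][j]) < EPS) break;     /* singular or nearly so */
        rc[j] = psi[j];
        for (k = 0; k < j; k++) {
            rc[j] -= rc[k] * phi[j][k];
        }
        phi[j][j] = 1.0 / phi[j][j];
        rc[j] = clamp_rc(rc[j] * phi[j][j]);
    }
    for (i = j; i < ORDER; i++) {
        rc[i] = 0.0;                             /* zero the coefficients not computed */
    }
    return j;
}
```

With an identity PHI the routine simply returns PSI clamped to the range -0.986 to 0.986, which is a convenient sanity check.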
In one embodiment the resulting 12 RC values each lie in the range 
-0.986 to 0.986. These 12 RC values are passed to FrameType Logic 39 for 
determination of the type of channel quantization to use and to Quantize RCs 
process 40 for the actual channel encoding. 

In one embodiment FrameType Logic 39 examines the current frame's 
LogRMS value and the value of $RC_0$ to determine if a full frame (12 RCs plus 
Residue Descriptor) or a half frame (6 RCs with no Residue Descriptor) should 
be forwarded to the Decoder. This distinction is made to conserve significant 
bandwidth at the cost of minor signal degradation at the Decoder output. In 
the absence of bandwidth constraints it would be desirable to use full frames 
for all output. Each frame is initially assumed to be a half frame. The 
condition for declaring a full frame in FrameType Logic 39 employs a constant 
RMSThold which for typical telephone digital signals is advantageously set to 
20. Higher values may be used with a resultant loss of signal quality at the 
Decoder output. Lower values of RMSThold result in a higher channel 
bandwidth and increased signal quality at the Decoder output. The condition 
implemented for declaring a full frame type in FrameType Logic 39 is: 

RC_0 >= 0 AND LogRMS > RMSThold                              (Eq. 26a) 
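
The frame type decision of Eq. 26a can be sketched in C as below. This is a minimal illustration: the FrameType enumeration, the function name select_frame_type, and the RMS_THOLD macro are hypothetical names, with the threshold set to the value the text suggests for typical telephone signals.

```c
#include <assert.h>

typedef enum { HALF_FRAME = 0, FULL_FRAME = 1 } FrameType;

#define RMS_THOLD 20   /* suggested value for typical telephone digital signals */

/* Every frame starts out as a half frame and is promoted to a full frame
 * only when the first reflection coefficient is non-negative and the
 * epoch is loud enough. */
FrameType select_frame_type(double rc0, int log_rms)
{
    assert(log_rms >= 0 && log_rms <= 31);
    return (rc0 >= 0.0 && log_rms > RMS_THOLD) ? FULL_FRAME : HALF_FRAME;
}
```

Raising RMS_THOLD produces more half frames and a lower bit rate, while lowering it produces more full frames and higher quality.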

Quantize RCs process 40 encodes the Raw RCs as created in Eq. 26 into 
integer values on limited ranges suitable for transmission with a minimal 
number of bits. Techniques for such a process are well-known in the prior art. 
See for example the discussion in O'Shaughnessy, Douglas, Speech 
Communication: Human and Machine, p. 356, Addison-Wesley, New York, New 
York, 1987. 

In one embodiment the first two RCs ($RC_0$ and $RC_1$) are encoded by quantizing 
the log area ratios of the RCs rather than the RCs themselves. This log area 
ratio encoding provides more resolution when the RC values are near +1 or -1, 
the regions in which small changes in RC value have the greatest perceptual 
effects. The Log area ratio function is given as: 

$$Lar_j = \mathrm{Log}_e\left( (1 + RC_j) / (1 - RC_j) \right)$$ (Eq. 27) 

The remaining RCs are encoded linearly from their Raw values. 
Quantize RCs process 40 computes both the encoded values { qvj , for j=0 to 11} 
for transmission and the reconstructed quantized RCs {qRCj , for j=0 to 11} that 
equal the RCs that will be reconstructed in the Decoder. 

In one embodiment the Quantization process constrains each RC_j to a
predetermined range given by the values HiClamp_j and LoClamp_j as shown
in Table 2 below.

Table 2.

j     HiClamp    LoClamp
0     0.986      -0.986
1     0.986      -0.986
2     0.9        -0.9
3     0.9        -0.9
4     0.9        -0.9
5     0.75       -0.9
6     0.75       -0.75
7     0.75       -0.75
8     0.75       -0.75
9     0.75       -0.75
10    0.7        -0.7
11    0.7        -0.7

The number of bits used to encode each RC_j is a function of j and the
frame type (full or half), as shown in Table 3 below.

Table 3. Bit Allocations for RCs

j     BitsFull   BitsHalf
0     7          6
1     7          6
2     6          5
3     6          5
4     5          4
5     5          4
6     4          0
7     4          0
8     4          0
9     4          0
10    3          0
11    3          0



The process for quantizing and encoding RC_0 and RC_1 is given below:

RC_j = Maximum( LoClamp_j, RC_j )
RC_j = Minimum( HiClamp_j, RC_j )

qv_j = { Integer( 12.57 * Lar_j )    for full frames
       { Integer( 6.285 * Lar_j )    for half frames

a_j  = { exp( qv_j / 12.57 )    for full frames
       { exp( qv_j / 6.285 )    for half frames

qRC_j = (a_j - 1) / (a_j + 1) (Eq. 27a)

This encoding results in qv_j values which require 7 bits for transmission
in full frames and 6 bits for transmission in half frames.

The process for quantizing and encoding RC_2 through RC_5 is given below:

RC_j = Maximum( LoClamp_j, RC_j )
RC_j = Minimum( HiClamp_j, RC_j )

qv_j = { Integer( (2**BitsFull_j - 1) * (RC_j - LoClamp_j) / (HiClamp_j - LoClamp_j) )
       {                                                    for full frames
       { Integer( (2**BitsHalf_j - 1) * (RC_j - LoClamp_j) / (HiClamp_j - LoClamp_j) )
       {                                                    for half frames

qRC_j = { LoClamp_j + (HiClamp_j - LoClamp_j) * (qv_j + .5) / (2**BitsFull_j - 1)
        {                                                   for full frames
        { LoClamp_j + (HiClamp_j - LoClamp_j) * (qv_j + .5) / (2**BitsHalf_j - 1)
        {                                                   for half frames
(Eq. 27b)

This encoding results in qv_j values which require BitsFull_j bits for
transmission in full frames and BitsHalf_j bits for transmission in half frames.

The process for quantizing and encoding RC_6 through RC_11 is given below:

RC_j = Maximum( LoClamp_j, RC_j )
RC_j = Minimum( HiClamp_j, RC_j )

qv_j = { Integer( (2**BitsFull_j - 1) * (RC_j - LoClamp_j) / (HiClamp_j - LoClamp_j) )
       {                                                    for full frames
       { 0                                                  for half frames

qRC_j = { LoClamp_j + (HiClamp_j - LoClamp_j) * (qv_j + .5) / (2**BitsFull_j - 1)
        {                                                   for full frames
        { 0                                                 for half frames
(Eq. 27c)

This encoding results in qv_j values which require BitsFull_j bits for
transmission in full frames and 0 bits for transmission in half frames.
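
The linear quantization of Eqs. 27b and 27c may be sketched in C as shown
below. This is an illustrative sketch only: the function names are
hypothetical, Integer() is assumed to truncate, and a bit count of 0 is used
to represent an RC that is not transmitted in a half frame.

```c
#include <assert.h>

/* Uniform quantizer of Eqs. 27b/27c for one RC.
   bits == 0 means the coefficient is not transmitted. */
static int rc_encode_linear(double rc, double lo, double hi, int bits)
{
    assert(hi > lo && bits >= 0 && bits < 16);
    if (bits == 0)
        return 0;
    if (rc < lo) rc = lo;                /* Maximum( LoClamp_j, RC_j ) */
    if (rc > hi) rc = hi;                /* Minimum( HiClamp_j, RC_j ) */
    int levels = (1 << bits) - 1;        /* 2**bits - 1 */
    return (int)(levels * (rc - lo) / (hi - lo));  /* assumed truncation */
}

/* Reconstruction of qRC_j from qv_j, mirroring the Decoder. */
static double rc_decode_linear(int qv, double lo, double hi, int bits)
{
    assert(hi > lo && bits >= 0 && bits < 16);
    if (bits == 0)
        return 0.0;
    int levels = (1 << bits) - 1;
    return lo + (hi - lo) * (qv + 0.5) / levels;
}
```

For example, RC_2 in a full frame would be encoded with lo = -0.9, hi = 0.9
and bits = 6 according to Tables 2 and 3.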

The reconstructed quantized RCs {qRC_j , for j=0 to 11} are passed to RC
Priority Logic 41 and to Secondary Epoch Analysis 50. The encoded values
{qv_j , for j=0 to 11} are passed to Frame Assembly unit 60.



RC Priority Logic 41 determines the importance of the RCs in a
particular frame to the quality of the reconstructed speech at the Decoder. In
one embodiment frames are assigned a priority in the range 0 to 15. Frames
with minimal importance are assigned a priority of 15, while frames of greatest
importance are assigned a priority of 0. The RC Priority Logic computes two
measures of distance on the qRCs: rcdif and rcdif0. The distance is computed
between the current frame and the last frame that would have been transmitted to
the Decoder when only priorities of 2 or less are transmitted. Whenever a
frame is encountered that is assigned a priority of 2 or less, its {qRC_j} and {qv_j}
values become the reference set {ref_qRC_j} and {ref_qv_j} for computing distance
and hence priorities in succeeding frames. The distance measures are
computed as follows:

$\mathrm{rcdif} = \sum_{k=0}^{11} \left| qv_k - \mathrm{ref\_qv}_k \right|$ (Eq. 28)

$\mathrm{rcdif0} = \left| qRC_0 - \mathrm{ref\_qRC}_0 \right|$ (Eq. 29)



Priority Logic 41 then employs an empirically derived constant
RCDropTH which is used to tune the overall data rate range of the system. In
one embodiment RCDropTH is set to 110, which results in average channel data
rates on typical telephone conversations of approximately 1600 bps when only
parameters with priority of 2 or less are transmitted and average rates of
approximately 3200 bps when all parameters are transmitted through the
channel. The priority value to be assigned to RCs in the current frame is
determined as follows:

Rcdimport = rcdif / RCDropTH
Rc0import = rcdif0 / ( RCDropTH / 700 )

Rcimport = { Maximum(Rcdimport, Rc0import)    if LogRMS > RMSThold
           { Rcdimport                        otherwise

Rcpriority = ( 2 + 15 * (1 - Rcimport) )
Rcpriority = Maximum( 1, Rcpriority )
Rcpriority = Minimum( 15, Rcpriority )

If (current frame is full and previous frame was half OR
    current frame is half and previous frame was full)
    Rcpriority = { 1    if LogRMS <= RMSThold
                 { 0    otherwise
(Eq. 30)
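
For illustration, the priority computation of Eqs. 28 through 30 could be
sketched in C as follows. The function name and argument layout are
hypothetical, and truncation of Rcpriority to an integer is an assumption,
since the specification does not state how the fractional value is rounded.

```c
#include <assert.h>
#include <stdlib.h>   /* abs */

#define RC_DROP_TH 110.0   /* RCDropTH value quoted in the text */

static int rc_priority(const int qv[12], const int ref_qv[12],
                       double qrc0, double ref_qrc0,
                       int log_rms, int rms_thold,
                       int cur_full, int prev_full)
{
    assert(qv && ref_qv);

    /* Eq. 28: summed code distance to the reference frame. */
    int rcdif = 0;
    for (int k = 0; k < 12; k++)
        rcdif += abs(qv[k] - ref_qv[k]);

    /* Eq. 29: distance on the first reconstructed RC. */
    double rcdif0 = qrc0 - ref_qrc0;
    if (rcdif0 < 0.0)
        rcdif0 = -rcdif0;

    /* Eq. 30 */
    double rcdimport = rcdif / RC_DROP_TH;
    double rc0import = rcdif0 / (RC_DROP_TH / 700.0);
    double rcimport = rcdimport;
    if (log_rms > rms_thold && rc0import > rcdimport)
        rcimport = rc0import;

    double p = 2.0 + 15.0 * (1.0 - rcimport);
    if (p < 1.0)  p = 1.0;
    if (p > 15.0) p = 15.0;
    int priority = (int)p;               /* assumption: truncation */

    /* A change of frame type forces a high-importance frame. */
    if ((!cur_full) != (!prev_full))
        priority = (log_rms <= rms_thold) ? 1 : 0;
    return priority;
}
```

In this sketch a frame identical to the reference set receives priority 15,
while a frame whose summed code distance reaches RCDropTH receives priority 2
or lower and would therefore become the new reference set.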

The Rcpriority is forwarded to the Frame Assembly unit 60.

Secondary Epoch Analysis 50 computes a Residue Descriptor parameter
that is transmitted in full frames only and acts as a fine tuning of the Epoch
Length by controlling the position at which the Decoder places the excitation
pulse in the reconstructed Epoch's excitation.

Secondary Epoch Analysis 50 proceeds as shown in Figure 5 in which
the quantized RCs {qRC_j} as computed by Quantize RCs process 40 are
converted to predictor coefficients {pc_j ; for j=0,...,11} by Convert RCs to PCs
operation 51. This operation is carried out as follows:

pc[0] = qRC[0];
for(i=1; i<12; i++) {
    for(j=0; j<i; j++) temp[j] = pc[j] - qRC[i] * pc[i-j-1];
    for(j=0; j<i; j++) pc[j] = temp[j];
    pc[i] = qRC[i];
} (Eq. 31)
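
As an illustrative sketch, the recursion of Eq. 31 may be written as a
self-contained C function. The function name rc_to_pc and the constant
LPC_ORDER are hypothetical names introduced here for the example.

```c
#include <assert.h>

#define LPC_ORDER 12

/* Convert quantized reflection coefficients to predictor coefficients
   using the step-up recursion of Eq. 31. */
static void rc_to_pc(const double qrc[LPC_ORDER], double pc[LPC_ORDER])
{
    double temp[LPC_ORDER];
    assert(qrc && pc);

    pc[0] = qrc[0];
    for (int i = 1; i < LPC_ORDER; i++) {
        /* Update the lower-order coefficients from a snapshot ... */
        for (int j = 0; j < i; j++)
            temp[j] = pc[j] - qrc[i] * pc[i - j - 1];
        for (int j = 0; j < i; j++)
            pc[j] = temp[j];
        /* ... then append the new highest-order coefficient. */
        pc[i] = qrc[i];
    }
}
```

The temporary array is needed because each pass reads pc[i-j-1] while
rewriting pc[j]; updating in place within a single loop would mix old and new
values.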

The predictor coefficients are then used to inverse filter the Bias
Removed Epoch Samples {e_k} to produce a residue signal {r_k}. This is
accomplished in Perform Inverse Filter operation 52 as follows:

$r_k = e_k - \sum_{j=0}^{11} pc_j \, e_{k-j-1}$ for k = 12, 13, ..., Actual_Epoch_Length + 11 (Eq. 32)



The residue {r_k} represents the excitation signal required to drive a filter
built with the predictor coefficients to reconstruct the input signal. The
Decoder will attempt to approximate this residue in the process of
reconstructing the speech. The only parameter derived from the residue is an
estimate of the location of the pulse within the epoch. This is determined as
follows by Locate Peak function 53:

ResPeak = 0;
PeakLoc = 0;
for( j=0; j<Actual_Epoch_Length; j++ )
    if( -r[j] > ResPeak ) { ResPeak = -r[j]; PeakLoc = j; }



The Peak location, PeakLoc, is encoded for transmission in 4 bits by the
following actions in Encode Peak Location operation 54:

ResDesc = { PeakLoc - Actual_Epoch_Length    if PeakLoc > Actual_Epoch_Length/2
          { PeakLoc                          otherwise
ResDesc = Maximum( ResDesc, -7 )
ResDesc = Minimum( ResDesc, 8 )
ResDesc = ResDesc + 7

The resulting range for ResDesc is 0 to 15, which can be encoded in 4
bits. ResDesc is forwarded to Frame Assembly unit 60 where it assumes the
priority, Rcpriority, from Eq. 30. Its number of bits will be 4 in full frames and
0 in half frames.
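
The peak search and the 4-bit Residue Descriptor encoding described above can
be sketched in C as follows. The function names are hypothetical, and integer
division is assumed for Actual_Epoch_Length/2; the residue array is taken to
be indexed from 0 over the epoch.

```c
#include <assert.h>

/* Locate Peak function 53: index of the most negative residue sample. */
static int locate_peak(const double *r, int epoch_len)
{
    assert(r && epoch_len > 0);
    double res_peak = 0.0;
    int peak_loc = 0;
    for (int j = 0; j < epoch_len; j++) {
        if (-r[j] > res_peak) {
            res_peak = -r[j];
            peak_loc = j;
        }
    }
    return peak_loc;
}

/* Encode Peak Location operation 54: map the peak position to a signed
   offset, clamp it to [-7, 8], and bias it into the range 0..15. */
static int encode_res_desc(int peak_loc, int epoch_len)
{
    assert(epoch_len > 0);
    int d = peak_loc;
    if (peak_loc > epoch_len / 2)        /* assumption: integer division */
        d = peak_loc - epoch_len;
    if (d < -7) d = -7;
    if (d > 8)  d = 8;
    return d + 7;
}
```

Peaks in the second half of the epoch are thus expressed as small negative
offsets from the end of the epoch, so that positions near either epoch
boundary are represented most accurately.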

Frame Assembly unit 60 of Figure 1 is the final Encoder operation in
preparing a frame of data for transmission. Two modes of operation are
possible for this module depending on whether a network traffic
management function is co-resident with the Encoder or remotely located.

In the case of a co-resident traffic manager (e.g. a traffic manager to
which the Encoder communicates over a high bandwidth channel) the Frame
Assembly process assembles into a standard format and forwards parameter
values, parameter encoding specifications (number of bits per parameter) and
parameter priorities to the traffic manager. The speech data in this format
requires approximately 56 kbps for transmission to the traffic manager. The
traffic manager then selects a priority level that provides the maximum output
speech quality for the available bandwidth. After a priority has been selected,
the traffic manager selects only those bits corresponding to encoded parameter
values with priorities at or below the requested priority value for transmission.
The priorities themselves and the number of bits per parameter are not
forwarded over the channel. The resulting transmission data rate varies from
about 1600 bps to 3200 bps depending on the priority level employed. There
are many factors beyond the scope of the present invention that may be
brought to bear in setting the available bandwidth for a given speech channel.
They include network congestion, bandwidth cost, and the channel user's
requested (contracted for) quality of service. It will be appreciated that in one
embodiment the priority level governing transmission rate may be dynamically
varied from frame to frame to meet rapidly changing network conditions.



In the case of a remotely located traffic manager, the traffic manager
forwards a requested priority level to the Encoder, which then performs the bit
stripping and packing operation itself to produce a low-rate bit stream for
transmission. Since this bit stream no longer has priority information included,
the network cannot further modify it.



A standard format frame with priority and bit size information included
is a block of 64 bytes laid out as in Table 4 below, which gives the possible
values for each byte in the frame.

Table 4. Possible Values for Each Entry in Parameter Frame

Parameter Name              Priority   #Bits   Value
Include RCs: IRC            0          1       0,1
EpochLen Delta Flag: EDF    0          3       0->7
RMS Delta Flag: RDF         0          2       0->3
EpochLength                 0          0,8     0->255
RMS                         0          0,5     0->31
RC1                         0->15      6,7     0->127
RC2                         0->15      6,7     0->127
RC3                         0->15      5,6     0->63
RC4                         0->15      5,6     0->63
RC5                         0->15      4,5     0->31
RC6                         0->15      4,5     0->31
RC7                         0->15      0,4     0->15
RC8                         0->15      0,4     0->15
RC9                         0->15      0,4     0->15
RC10                        0->15      0,4     0->15
RC11                        0->15      0,3     0->7
RC12                        0->15      0,3     0->7
ResDesc                     0->15      0,4     0->15
Unused                      0          0       0
Unused                      0          0       0
Unused                      0          0       0
Unused                      0

Each of the parameters listed in Table 4 corresponds to some number of
bits which may or may not be included in the bit stream sent to the Decoder. The
designation 0-bits implies that the parameter is not sent at all. The Include RC
Flag (IRC) is initially set to 1. When the traffic manager (or Encoder) "drops"
RCs based on their priority level the IRC bit is set to 0 to flag the absence of the
RCs for the given frame. Note that all RCs and the ResDesc within a given
frame have the same priority number, thus all are kept or dropped as a group.

In the case of a remote traffic manager, which has supplied a particular
priority level to the Encoder, the following operations are performed in the
Encoder to produce the bits sent to the digital transmission medium. These
same operations are performed by a co-resident traffic manager operating on
the 64-byte frame block.

The Encoder first compares the priority assigned to the RCs and
ResDesc in the frame to the requested (or allowed) priority for transmission. If
the priority for the RCs for this frame is less than or equal to the requested
priority, all RCs are to be retained. If the priority for the RCs for this frame is
greater than the requested priority, all RCs are to be dropped. This determines
the value of the IRC bit. Conversion from the frame structure to a bit stream
then proceeds from top to bottom in the 64-byte frame, examining each triplet
of priority, #bits, and value. If the priority is greater than the requested priority
the triplet is skipped. If the priority is less than or equal to the requested priority
the number of bits specified by the #Bits column are extracted from the low
end of the Value byte and forwarded to the bit stream. Figure 9 summarizes
the structure of the resulting bit stream. It will be appreciated that this
translation in this order to the bit stream results in a bit stream which is
uniquely decodable at the receiving end into the individual parameters as
discussed below under the Decoder operation. It will also be appreciated that
there are other arrangements of the bits which provide unique decodability and
may be advantageous in certain other implementations. In particular, in
environments with noticeable error rates imparted by the digital transmission
medium, it will be advantageous to encode the IRC, EDF, RDF, and first bit of
RC1 with error detection and correction codes to ensure rapid recovery of
frame synchronization after channel errors occur.
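
The bit stripping and packing operation described above may be sketched in C
as follows. This is a sketch under stated assumptions rather than the layout
of Figure 9: the Triplet structure, the function names, and the choice to
emit each value most significant bit first are all hypothetical, and the
caller is assumed to have already set the IRC value according to whether the
RCs are retained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One (priority, #bits, value) entry of the standard format frame. */
typedef struct {
    uint8_t priority;
    uint8_t nbits;
    uint8_t value;
} Triplet;

/* Append the low `nbits` bits of `value` to `out`, MSB first
   (the bit order within the stream is an assumption). */
static size_t put_bits(uint8_t *out, size_t bitpos, uint8_t value, int nbits)
{
    for (int b = nbits - 1; b >= 0; b--, bitpos++) {
        if ((value >> b) & 1u)
            out[bitpos / 8] |= (uint8_t)(0x80u >> (bitpos % 8));
    }
    return bitpos;
}

/* Walk the triplets top to bottom, skipping those whose priority exceeds
   the requested priority. `out` must be zero-initialised and large enough.
   Returns the number of bits written. */
static size_t strip_and_pack(const Triplet *t, size_t n, int requested,
                             uint8_t *out)
{
    size_t bitpos = 0;
    for (size_t i = 0; i < n; i++) {
        assert(t[i].nbits <= 8);
        if (t[i].priority > requested)
            continue;
        bitpos = put_bits(out, bitpos, t[i].value, t[i].nbits);
    }
    return bitpos;
}
```

Because neither the priorities nor the bit counts are written to the output,
the resulting stream is shorter than the 64-byte frame and can only be parsed
by a receiver that infers the layout from the header flags, as described for
the Decoder.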

Figure 2 illustrates a block diagram of a Decoder in one embodiment of
the invention. The Decoder consists of Frame Disassembly and Decoding unit
100 to reconstruct parameters from the digital bit stream, Excitation Generator
110 to construct an excitation signal, Synthesizing Filter 130 to filter excitation
signal 122 producing Raw Output Signal 136, and Output Scaling and Filtering
unit 140 to transform Raw Output Signal 136 into final Output Audio 148. At
the receiving (decompression) side the Decoder reconstructs each frame of
(typically) 15 parameters for each channel, flags the parameters that are
missing (were not sent due to bandwidth limitations over the Frame Relay
link), and presents the frame to a Synthesizer for reconstruction of the speech.

Frame Disassembly and Decoding unit 100 accepts the incoming bit
stream, disassembles it into individual frames and individual parameters
within each frame, and decodes those parameters into formats useful for
synthesis of speech corresponding to the input Epoch.

The first task in Frame Disassembly is the identification of the total
length of the frame and the location of individual parameters in the frame's bit
stream. To this end, with reference to the structures displayed in Figure 9, the
IRC bit is first examined to determine the presence or absence of the RC Block. The
next 3 bits are the EDF (Epoch Length Delta Flag). If the EDF is 7 there are 8
bits of Encoded Epoch Length following the RDF. The next 2 bits are the RDF
(RMS Delta Flag). If the RDF is 3, then the RMS absolute value is included as 5
bits following either the Epoch Length (if present) or the RDF (if no Epoch
Length). These operations have established the length and structure of the ER
Header (Epoch_Length RMS Header). The values in the ER Header are now
decoded as follows:

Epoch Length = previous Epoch Length + EDF - 3    if EDF < 7
Epoch Length = ELcode                             if EDF = 7 and 16 <= ELcode <= 200
Epoch Length = ELcode + 231                       if EDF = 7 and ELcode < 16
Epoch Length = ELcode + 46                        if EDF = 7 and ELcode > 200

LogRMS = previous LogRMS + RDF - 1                if RDF < 3
LogRMS = RMScode                                  if RDF = 3
(Eq. 33)
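
A brief C sketch of the ER Header decoding of Eq. 33 follows. The function
names are hypothetical; the arguments el_code and rms_code stand for the
ELcode and RMScode fields and are only meaningful when the corresponding flag
signals an absolute value.

```c
#include <assert.h>

/* Eq. 33, Epoch Length: EDF values 0..6 carry a delta of -3..+3 relative
   to the previous Epoch Length; EDF == 7 signals an 8-bit absolute code. */
static int decode_epoch_length(int prev_len, int edf, int el_code)
{
    assert(edf >= 0 && edf <= 7);
    if (edf < 7)
        return prev_len + edf - 3;
    if (el_code < 16)
        return el_code + 231;
    if (el_code > 200)
        return el_code + 46;
    return el_code;
}

/* Eq. 33, LogRMS: RDF values 0..2 carry a delta of -1..+1; RDF == 3
   signals a 5-bit absolute code. */
static int decode_log_rms(int prev_log_rms, int rdf, int rms_code)
{
    assert(rdf >= 0 && rdf <= 3);
    return (rdf < 3) ? prev_log_rms + rdf - 1 : rms_code;
}
```

In this sketch a header consisting only of the IRC bit and the two delta
flags occupies 6 bits, which reflects how slowly varying epochs can be
signalled without retransmitting absolute values.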

If the IRC bit is 0, the frame ends with the ER Header. Otherwise the
frame contains an RC Block with length and format established by the value of
the decoded LogRMS and the value of the first bit in the RC Block, which is the
sign bit of RC_0. If the LogRMS is greater than the RMSThold (as described in
conjunction with Eq. 26a above) and the first bit of the RC Block is 0, the RC
Block is a full frame containing 62 bits laid out as illustrated in Figure 9. If the
LogRMS is less than or equal to the RMSThold or the first bit of the RC Block is
1, the RC Block is a half frame containing 30 bits laid out as illustrated in Figure
9.

The individual RCs if present in the frame are decoded from their 
transmitted values {qvj} to produce the set {qRCj} according to Eqs. 27a, 27b, 
and 27c above. 

The LogRMS is decoded into a linear RMS approximation by using the
LogRMS value (an integer on [0,31]) as an index into the following table:

Table 5.

LogRMS  RMS     LogRMS  RMS     LogRMS  RMS     LogRMS  RMS
0       0       8       9       16      73      24      587
1       1       9       12      17      94      25      756
2       2       10      15      18      126     26      984
3       3       11      21      19      160     27      1,245
4       4       12      27      20      204     28      1,606
5       5       13      35      21      267     29      2,072
6       6       14      43      22      346     30      2,646
7       7       15      57      23      454     31      3,387

The Epoch Length, RMS, and decoded RCs, {qRC_j}, along with a flag
indicating if the RCs are present or not are passed to Excitation Generator 110,
Synthesizing Filter 130, and Output Scaling and Filtering 140 as illustrated in
Figure 2.

Excitation Generator 110 illustrated in Figure 6 begins by decoding the
Epoch Length in Decode Epoch Length function 120 to determine the actual
number of samples in the Epoch and whether or not the Epoch is a Double.
This is accomplished as follows:

Actual_Epoch_Length = { Epoch Length          if Epoch Length <= 200
                      { Epoch Length - 200    otherwise

Double Epoch Flag   = { False    if Epoch Length <= 200
                      { True     otherwise
(Eq. 34)

Excitation Generator 110 then executes Calculate Epoch Length
Dispersion function 111 to calculate an EpochLength Consistency factor that
measures the consistency versus dispersion of the successive epoch lengths as
follows:

True Epoch Length = { Actual_Epoch_Length        if Double Flag is False
                    { (Actual_Epoch_Length)/2    otherwise

d1 = { | Log(Previous True Epoch Length / True Epoch Length) |
     {         if True Epoch Length < 200 and Previous True Epoch Length < 200
     { 2.5     otherwise

dispersion = Previous d1 + d1

EpochLength Consistency = { 1 - (dispersion/2)    if dispersion < 2.0
                          { 0                     otherwise
(Eq. 35)

The EpochLength Consistency factor has values near 1.0 for voiced 
signals and near 0 for unvoiced signals. 

The Raw Mixing Fraction is computed in Estimate Mixing Fraction
operation 112 from the first RC, RC_0, as follows:

Alpha = { 0.9                   if RC_0 >= 0.2
        { (RC_0 + 0.4) * 1.5    if -0.4 < RC_0 < 0.2
        { 0                     else

Raw Mixing Fraction = (Alpha + Previous Alpha) / 2
(Eq. 36)

The Raw Mixing Fraction has values near 1.0 for voiced signals and near
0 for unvoiced signals.

Refine Mixing Fraction operation 113 combines the Raw Mixing Fraction
and EpochLength Consistency to produce a Mixing Fraction as follows:

Mixing Fraction = { (EpochLength Consistency) * (Raw Mixing Fraction)
                  {         if Raw Mixing Fraction < 0.8
                  { (EpochLength Consistency) * (Raw Mixing Fraction + 0.2)
                  {         if 0.8 < Raw Mixing Fraction <= 0.9
                  { (EpochLength Consistency) * (Raw Mixing Fraction + 0.4)
                  {         if 0.9 < Raw Mixing Fraction
(Eq. 37)
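
The mixing fraction computation of Eqs. 36 and 37 is sketched below in C for
illustration. The function names are hypothetical. Eq. 37 does not cover a
Raw Mixing Fraction of exactly 0.8; the sketch treats that case like the
first branch, which is an assumption.

```c
#include <assert.h>

/* Eq. 36: per-frame voicing estimate derived from RC_0. */
static double alpha_from_rc0(double rc0)
{
    if (rc0 >= 0.2)
        return 0.9;
    if (rc0 > -0.4)
        return (rc0 + 0.4) * 1.5;
    return 0.0;
}

/* Eq. 36 (averaging) followed by Eq. 37 (refinement). */
static double mixing_fraction(double consistency, double alpha,
                              double prev_alpha)
{
    assert(consistency >= 0.0 && consistency <= 1.0);
    double raw = (alpha + prev_alpha) / 2.0;

    double boost = 0.0;
    if (raw > 0.9)
        boost = 0.4;
    else if (raw > 0.8)
        boost = 0.2;    /* raw == 0.8 falls through with no boost (assumed) */

    return consistency * (raw + boost);
}
```

Since the EpochLength Consistency multiplies every branch, an epoch with
erratic lengths is driven toward noise excitation regardless of the value of
RC_0.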

The pulse portion of the excitation is created by first selecting the final
12 points of the previous unshaped synthesized audio signal {U_n} described
below. This signal, which is used to provide history to Synthesizing filter 133,
needs to be adjusted by the relative gain levels of the previous and current
epochs. This is accomplished in Scale Tail of Excitation from Previous Epoch
114 as follows:

Tailscale = { PreviousRMS / RMS    if PreviousRMS/RMS <= 4.0
            { 4.0                  otherwise

exc_j = Tailscale * previous_exc_{previous Actual_Epoch_Length + j}    j=0,1,...,11
(Eq. 38)

Next a fixed shape excitation pulse is used to provide the body of the
pulse portion of the excitation in Copy Single or Double Pulse operation 115.
To this end a fixed excitation source signal, {dkexc_k}, is stored in advance as:

dkexc_k = { (-98, -66, -130, -9, -233, 174, -537, 558, -741, 104)    k=0,...,9
          { (477, -578, -669, -5, 554, 643, 443, 200, 70, 29)        k=10,...,19
          { (13, -29, -83, -126, -81)                                k=20,...,24
          { (0, 0, ..., 0)                                           k=25,...,199
(Eq. 39)

The excitation signal {exc_j} is then filled in according to:

If Double Flag is False
    exc_{j+12} = dkexc_j          j=0,...,Actual_Epoch_Length-1

If Double Flag is True
    Half1 = Integer( (Actual_Epoch_Length)/2 )
    exc_{j+12} = dkexc_j          j=0,...,Half1-1
    exc_{j+12+Half1} = dkexc_j    j=0,...,Actual_Epoch_Length-Half1-1
(Eq. 40)

Remove DC Bias operation 116 then removes any DC Bias from the non-tail
portion of the pulse excitation as follows:

$\mathrm{exc}_{j+12} = \mathrm{exc}_{j+12} - \frac{1}{\text{Actual\_Epoch\_Length}} \sum_{k=0}^{\text{Actual\_Epoch\_Length}-1} \mathrm{exc}_{k+12}$
for j = 0, ..., Actual_Epoch_Length-1 (Eq. 41)

Time Shift operation 117 employs the Residue Descriptor information to
shift the location of the pulse(s) in the excitation to more nearly match the pulse
alignment within the epoch in the original residue in the Encoder, using a
circular shift as follows:

$\mathrm{exc}_{j+12} = \mathrm{exc}_{(j+12+\mathrm{ResDesc}+2) \bmod \text{Actual\_Epoch\_Length}}$ (Eq. 42)

This completes the creation of the pulsed portion of the excitation. In
one embodiment the noise portion of the excitation {uvn_k} is created using a
Random Number Generator Rrand() that generates numbers uniformly
distributed on the range (-32768, +32767).

uvn_k = Rrand() / 256 for k = 0, ..., Actual_Epoch_Length-1 (Eq. 43)

Any convenient Random Number Generator with suitable properties
may be used; an exemplary Random Number Generator based on Knuth, D.,
The Art of Computer Programming, Fundamental Algorithms, Vol. 2, p. 27,
Addison-Wesley, New York, 1998, is given in C programming code by:

int Rrand()
{
    int the_random;
    static short y[5] = {-21161, -8478, 30892, -10216, 16950};
    static int j = 1, k = 4;

    /* The following is a 16 bit 2's complement addition,
       with overflow checking disabled */
    y[k] += y[j];
    if (y[k] > 32767) y[k] = -(32768 - (y[k] & 32767));
    if (y[k] < -32768) y[k] = y[k] & 32767;
    the_random = y[k];

    /* Step both indices backwards through the 5-element history */
    k--; if (k < 0) k = 4;
    j--; if (j < 0) j = 4;
    return (the_random);
} (Eq. 44)

Final excitation signal 122 is created from the noise signal {uvn_k} and
pulse portion {exc_j} via scaling 119, 120 and a summing operation 121 as
follows:

exc_{j+12} = Mixing Fraction * exc_{j+12} + (1 - Mixing Fraction) * uvn_j
for j=0,...,Actual_Epoch_Length-1
(Eq. 45)

Synthesizing Filter 130 is illustrated in Figure 7 where the first
operation, Convert RCs to PCs 131, is accomplished using the technique in the
Encoder's Secondary Analysis as specified in Eq. 31. The predictor coefficients
{pc_j, j=0,...,11} are then employed in Apply PC Filter to Excitation operation 133
to filter Excitation Signal 122, thus producing Unshaped Synthesized Audio
Signal {U_n} 134, according to the following equation:

$U_n = \mathrm{exc}_{n+12} + \sum_{j=0}^{11} pc_j \, \mathrm{exc}_{n+12-j-1}$ for n = 0, ..., Actual_Epoch_Length-1
(Eq. 46)



Unshaped Synthesized Audio Signal {U_n} 134 is then subjected to a
filter with fixed coefficients which boosts low frequencies in Low Frequency
Spectral Shaping Filter 135, to produce Raw Output Signal {ros_n} 136 as
follows:

$\mathrm{ros}_n = U_n - 2.39399\,U_{n-1} + 2.249895\,U_{n-2} - 0.967\,U_{n-3} + 0.1681\,U_{n-4} + 2.40557\,\mathrm{ros}_{n-1} - 2.233958\,\mathrm{ros}_{n-2} + 0.9051\,\mathrm{ros}_{n-3} - 0.1336\,\mathrm{ros}_{n-4}$
for n = 0, ..., Actual_Epoch_Length-1
(Eq. 47)



Output Scaling and Filtering operation 140 is illustrated in Figure 8 in
which the first operation, Compute RMS of Raw Output Signal 141, proceeds
to compute Rosrms as follows:

$\mathrm{Rosrms} = \left[ \left( \sum_{n=0}^{\text{Actual\_Epoch\_Length}-1} \mathrm{ros}_n \cdot \mathrm{ros}_n \right) / \text{Actual\_Epoch\_Length} \right]^{1/2}$
(Eq. 48)



Compute RMS Scale Factor operation 143 then computes a gain for the
current epoch from the input RMS and the Rosrms as follows:

Gain = RMS / Rosrms
(Eq. 49)



The Raw Output Signal is then scaled by the Gain in the operation at 145
to produce the signal {gr_n} via:

$gr_n = \mathrm{Gain} \cdot \mathrm{ros}_n$ for n = 0, ..., Actual_Epoch_Length-1 (Eq. 50)

The Gain scaled signal {gr_n} is then filtered by Low Pass Filter 147 to
produce final Output Audio {O_n} 148, according to the following equation:

$O_n = 0.4\,gr_n + 0.2\,gr_{n-1} + 0.5\,O_{n-1}$ for n = 0, ..., Actual_Epoch_Length-1
(Eq. 51)

The output audio signal can then be forwarded to various mechanisms,
such as a Digital to Analog (D/A) converter, amplifier, and speaker, that
present the signal to a receiving end-user.

Therefore, a mechanism is provided by which traffic management systems external
to an embodiment may meet the needs of rapidly changing network conditions
by dynamically varying the bandwidth allocated to a given channel of signal
activity, such as speech or audio activity, with predictable influence on the
quality of the reconstructed (received) signal. The present invention can
support Quality of Service (QoS) protocols in which end-users trade off speech
quality versus cost of service.

Further, the present invention flags portions of the digital signal as
deletable from the bit stream and identifies the effects that each such deletion
will have on the output speech quality.

Thus, by meeting the demands of a transmission medium's dynamically
changing bandwidth (of transmission rates) by compressing signals in
accordance with the dynamically changing bandwidth, communication over
the medium is carried out in a manner that maximizes the quality of the
reconstructed signal.

The above embodiments can also be stored on a device or medium and
read by a machine to perform instructions. The device or medium may include
a solid state memory device and/or a rotating magnetic or optical disk. The
device or medium may be distributed when partitions of instructions have
been separated into different machines, such as across an interconnection of
computers.

While certain exemplary embodiments have been described and shown
in the accompanying drawings, it is to be understood that such embodiments
are merely illustrative of and not restrictive on the broad invention, and that
this invention not be limited to the specific constructions and arrangements
shown and described, since various other modifications may occur to those
ordinarily skilled in the art.